The use of Diphone Variants in Optimal Text Selection for Finnish Unit Selection Speech Synthesis
Abstract
The speech quality of a unit selection speech synthesizer depends heavily on the database. This paper describes an approach to selecting sentences for Finnish speech database recordings, aiming at optimal coverage. The main idea is to define the diphone in a slightly different way: to distinguish diphones consisting of different allophones and also different linguistic positions, i.e. intra- and inter-syllabic diphones. We call these diphone variants. We evaluated whether diphone variants become included in text selection for TTS prompt design without separate optimization, and coarsely verified their acoustic dissimilarity. With the same number of sentences (292) that fulfill the traditionally determined diphone coverage completely, 66% more allophonic and inter/intra-syllabic contexts were missing with the conventional method compared to the proposed approach. We also describe how the approach inspired the synthesis process to reduce computational load.
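The coverage idea in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not say which selection algorithm was used, so a plain greedy set-cover heuristic is assumed here. The names `diphone_variants` and `greedy_select`, the tuple representation of a unit, and the toy phone labels are all hypothetical. A sentence is assumed to be given as a list of syllables, each a list of phone (or allophone) labels, and word boundaries are ignored for simplicity.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical unit representation: (left phone, right phone, position),
# where position is "intra" (inside a syllable) or "inter" (across a
# syllable boundary). Allophones are assumed to carry distinct labels.
Unit = Tuple[str, str, str]


def diphone_variants(syllables: List[List[str]]) -> Set[Unit]:
    """Collect the diphone variants of one sentence given as syllables."""
    units: Set[Unit] = set()
    prev = None
    for syllable in syllables:
        for i, phone in enumerate(syllable):
            if prev is not None:
                # The first phone of a syllable pairs with the last phone
                # of the previous syllable, so that diphone is inter-syllabic.
                position = "inter" if i == 0 else "intra"
                units.add((prev, phone, position))
            prev = phone
    return units


def greedy_select(candidates: Dict[str, Set[Unit]]) -> List[str]:
    """Greedy set cover (an assumed heuristic): repeatedly pick the
    sentence that adds the most units not yet covered."""
    uncovered: Set[Unit] = set().union(*candidates.values())
    chosen: List[str] = []
    while uncovered:
        # sorted() makes tie-breaking deterministic (alphabetical by id).
        best = max(sorted(candidates), key=lambda s: len(candidates[s] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            break
        chosen.append(best)
        uncovered -= gain
    return chosen


if __name__ == "__main__":
    corpus = {
        "s1": diphone_variants([["t", "a"], ["l", "o"]]),
        "s2": diphone_variants([["t", "a"]]),
        "s3": diphone_variants([["t", "a", "l"], ["o"]]),
    }
    print(greedy_select(corpus))
```

In this toy corpus, "s1" and "s3" contain the same phone sequence but syllabify it differently, so they contribute different diphone variants and both are needed for full coverage. A conventional diphone inventory would treat them as equivalent.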
Similar resources
Building a Finnish unit selection TTS system
Speech synthesis based on unit selection can produce far more natural speech than conventional diphone-based methods. Unit selection based text-to-speech synthesizers have been built for many different languages. In this paper, we describe the development of TUT VOICE, the first Finnish unit selection synthesis engine for academic research. The system includes database construction, synthesis e...
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis gives computers the human ability to produce speech. The data-based and process-based approaches are the two main approaches to speech synthesis. Each has its own challenges. Unit-selection speech synthesis and statistical parametr...
Halfphones: A Backoff Mechanism for Diphone Unit Selection Synthesis
Diphone backoff mechanisms in text-to-speech provide a means of ensuring that synthesis of the text takes place, even if some of the diphones in the text are missing from the speech database. This paper describes an automatic method for synthetically creating missing diphones from halfphones that are in the speech database.
Diphone synthesis using unit selection
This paper describes an experimental AT&T concatenative synthesis system using unit selection, for which the basic synthesis units are diphones. The synthesizer may use any of the data from a large database of utterances. Since there are in general multiple instances of each concatenative unit, the system performs dynamic unit selection. Selection among candidates is done dynamically at synthes...
Robust Unit Selection System for Speech Synthesis
There has been much interest for many years in diphone-based concatenative speech synthesis and, recently, a rapidly increasing interest in unit selection based synthesis (as illustrated by the CHATR [2] system). However, the limitations of both types of system are well known. While intelligibility is generally very high for diphone based systems, the resulting signals do not sound completely n...